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Abstract — Model-based prognostics approaches use domain 
knowledge about a system and its failure modes through the use 
of physics-based models. Model-based prognosis is generally 
divided into two sequential problems: a joint state-parameter 
estimation problem, in which, using the model, the health of a 
system or component is determined based on the observations; 
and a prediction problem, in which, using the model, the state- 
parameter distribution is simulated forward in time to compute 
end of life and remaining useful life. The first problem is 
typically solved through the use of a state observer, or filter. The 
choice of filter depends on the assumptions that may be made 
about the system, and on the desired algorithm performance. 
In this paper, we review three separate filters for the solution 
to the first problem: the Daum filter, an exact nonlinear filter; 
the unscented Kalman filter, which approximates nonlinearities 
through the use of a deterministic sampling method known 
as the unscented transform; and the particle filter, which ap- 
proximates the state distribution using a finite set of discrete, 
weighted samples, called particles. Using a centrifugal pump as 
a case study, we conduct a number of simulation-based experi- 
ments investigating the performance of the different algorithms 
as applied to prognostics. 
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1. Introduction 

Model-based prognostics approaches employ domain knowl- 
edge about a system, its components, and how they fail 
through the use of physics-based models, derived from first 
principles that capture the underlying physical phenomena. 
Model-based prognosis is generally divided into two sequen- 
tial problems: (i) a joint state-parameter estimation prob- 
lem, in which, using the model, the health of a system is 
determined based on the observations, and (ii) a prediction 
problem, in which, using the model, the state-parameter 
distribution is simulated forward in time to compute end of 
life (EOL) and remaining useful life (RUL). The first problem 
is typically solved through the use of a state observer, or 
filter. The choice of filter depends on the assumptions that 
may be made about the system, and on the desired algorithm 


performance and other characteristics. 

The most well-known filter is the Kalman filter, which is an 
exact filter for linear systems with additive Gaussian noise, 
i.e., it is the optimal filter for these types of systems. Kalman 
filters have been used for prognostics problems in [1,2], In 
many cases, however, especially with prognostics, systems 
are nonlinear, and the Kalman filter cannot be used. For spe- 
cial classes of nonlinear dynamics, exact filters are available 
such as the Benes filter [3] and the Daum filter [4] along 
with its variants. Such filters are advocated for prognostics 
applications in [5], although the classes of dynamics they 
support are quite restrictive in practice. Approximate filters 
may also be used, such as the extended Kalman filter and 
the unscented Kalman filter [6], which are restricted also 
to additive Gaussian noise. Another nonlinear filter is the 
particle filter, which approximates the state distribution using 
a finite set of discrete, weighted samples, called particles [7]. 
The particle filter does not restrict the dynamics or the noise 
in any way, but suffers from a high computational complexity. 
Still, particle filters, due to their general applicability and ease 
of implementation, have been a popular choice for nonlinear 
estimation within prognostics, e.g., [8-10], 

In this paper, we develop a general model-based prognostics 
methodology that uses filters for the joint state-parameter 
estimation step. We review three separate nonlinear filters: 
(0 the Daum filter, (ii) the unscented Kalman filter, and (Hi) 
the particle filter. Each of these approaches differs in the 
formulation of the underlying model, the assumptions that 
they make, their representation of the state-parameter distri- 
bution, the computational complexity, and the performance 
that may be achieved. Using a centrifugal pump as a case 
study, we conduct a number of simulation-based experiments 
investigating the performance trade offs of the different prog- 
nostic algorithms. Since the Daum filter does not support the 
class of dynamics that the centrifugal pump model falls in, we 
perform experiments only using the unscented Kalman filter 
and the particle filter. Algorithm performance is measured 
using established prognostics metrics [11]. 

The paper is organized as follows. Section 2 formally defines 
the prognostics problem and describes the model-based prog- 
nostics architecture. Section 3 describes different estimation 
methods and develops a flexible variance control scheme. 
Section 4 discusses the prediction methodology. Section 5 
describes the case study and provides results from a number 
of simulation-based experiments and evaluates the approach. 
Section 6 concludes the paper. 

2. Model-based Prognostics 

The problem of prognostics is to predict the EOL and/or the 
RUL of a component, subsystem, or system. We assume the 
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Figure 1. Prognostics architecture. 


system model may be generally defined as 

*(t) = f(t,x(t),9(t),u(t),v(t)) 
y (t) = h (t, x(t),9(t), u(i), n(f)), 

where x(f) € M” 1, is the state vector, 9(t) £ K" 8 is 
the unknown parameter vector, u (t) £ K n “ is the input 
vector, v(i) £ R"” is the process noise vector, f is the state 
equation, y(t) £ is the output vector, n (t) £ R n ' 1 is 
the measurement noise vector, and h is the output equation. 
This is a general nonlinear model with no restrictions on the 
functional forms of f or h, or on how the noise terms are 
coupled with the states and parameters. 

In prognostics, we are interested in when the performance 
of a system lies outside some desired region of acceptable 
behavior. Outside this region, we say that the system has 
failed. The desired performance is expressed through a set 
of c constraints, C = {Cj}?_ 1 , where C* is a function 


The model-based prognostics architecture is shown in Fig. 1 . 
In discrete time />:, the system is provided with inputs u /r 
and provides measured outputs y k . The damage estimation 
module uses this information, along with the system model, 
to compute an estimate p(x k , 9 k \yo-.k)- While the damage 
estimation module may be implemented in various ways, in 
this paper, we focus on filtering solutions to this problem. 
The prediction module uses the joint state-parameter distri- 
bution and the system model, along with hypothesized future 
inputs, to compute EOL and RUL as probability distributions 
p(EOL kp |y 0: fcp) and p{RUL kp \y 0:kp ) at given prediction 
times kp. A fault detection, isolation, and identification 
(FDII) module may be used in parallel to determine which 
damage mechanisms are active, represented as a fault set F, 
which may enable the damage estimation module to limit 
the dimension of the joint state-parameter space that must be 
estimated. In this paper, we assume such a module is absent. 


3. Damage Estimation 


Ci : R n * xK"Ml 

that maps a given point in the joint state-parameter space, 
(x(t), 9(t)), to the Boolean domain B = [0,1], where 
Ci{x(t),9(t)) = 1 if the constraint is satisfied. If 
Ci(x(t),9(t)) = 0, then the constraint is not satisfied and 
the system has failed. 

These individual constraints may be combined into a single 
threshold function Teol , where 

T E ol ■ x -> B, 

defined as 


In model-based prognostics, damage estimation reduces 
to joint state-parameter estimation, i.e., computation of 
p(x k ,9 k \ yo;fc). A common solution to this problem is to 
use a filter, where the unknown parameters simply augment 
the state vector. Treating parameters as states in most cases 
makes the system nonlinear, if not already so. In any case, 
most models for prognostics are nonlinear because damage 
progression models are rarely linear. Therefore, we focus 
here on nonlinear filters. Specifically, we consider the Daum 
filter [4], which is an exact nonlinear filter for a specific 
class of nonlinear functions, the unscented Kalman filter 
(UKF) [6, 12], which is an approximate nonlinear filter that 
uses deterministic sampling, and the particle filter (PF) [7], 
which is an approximate nonlinear filter that uses stochastic 
sampling. 


T E OL(x(t),9{t)) 


fl, 0£{C i (x(t),9(t))}U 
[0, otherwise. 


Daum Filter 

With the Daum filter, the system model is defined by 


That is, Teol evaluates to 1, i.e., the system has failed, when 
any of the constraints are violated. 

At some point in time, tp , the system is at (x(fp), 9(tp)) and 
we are interested in predicting the time point t at which this 
state will evolve to (x(i), 9(t.)) such that Tpop(x(f), 9(t)) = 
1. Using Teol , we formally define EOL with 

EOL(tp) = inf{t £ ffi. : t > tp A TEOL{*-(t),9(t)) = 1}, 

i.e., EOL is the earliest time point at which Teol is met. 
RUL is expressed using EOL as 


x(f) = f(f,x(f),u(f)) + G(f)— (1) 

y(f) = H(f)x(f) + n(f), (2) 

where G(t) is the process noise matrix (with the covariance 
of v being I), and H(i) is the observation matrix. The sensor 
noise has covariance R. Note the following restrictions on 
the system model. First, the process noise is restricted to 
be additive and Brownian. Second, the output equation is 
assumed to be linear, and the sensor noise is assumed to be 
additive and Gaussian. 


RUL(t P ) = EOL(t P ) — t P . 
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The system model is restricted further to satisfy certain 
assumptions. First, there must exist a function 







that satisfies the following Fokker-Planck partial differential 
equation (PDE): 


dip 

~dt 


dip 

dx 


f — ip tr 



1 

2 tt 



(3) 


where Q = GG T . Further, there must exist A, b, c, D, and 
E such that 


tr 



+ (s/2)rQr T = x 1 Ax + h 1 x + c 


f sQr T = Dx + E, 


(4) 

(5) 


Kalman filter (EKF), which approximates the Kalman fil- 
ter by linearizing the dynamics around the operating point. 
However, when the model dynamics are highly nonlinear, 
the performance of the EKF suffers [6], Further, the EKF 
requires the derivation of Jacobians, which may be difficult 
to derive analytically, or expensive to compute online, and 
in some cases do not exist (e.g., when there are discontinu- 
ities). In the most basic sense, the EKF linearizes the model 
at each step and then applies the Kalman filter equations. 
The unscented Kalman filter, instead of approximating the 
nonlinearity, approximates the state distribution [6, 12], This 
procedure maintains the nonlinear functions exactly, elimi- 
nating the need to calculate Jacobians, and thereby providing 
an easier implementation framework. 


where 0 < s < 1, and r(x, t) = d/dx(lnip(x, t)). 

If these conditions hold, in addition to some additional as- 
sumptions (see [4]), then we have 

p(xfc|y 0: fc) = ip(x k ,k)- 

exp(-i(x fc - m fc ) T P^ 1 (x fc - m fc )). (6) 

Here, m and P are the filter variables and are analogous to 
mean and covariance. The filter computes them using 

m fc = 111 ,.; + P k U k R^ 1 (y k - H k m fe ) (7) 

P fc = P fe - PfcH^ (HfcPfcHjT + Rfe) _1 HfcPfc. (8) 


The UKF approximates a distribution using the unscented 
transform (UT). The UT takes a random variable x £ M n *, 
with mean x and covariance P xx . which is related to a second 
random variable y by some nonlinear function y = g(x), and 
computes the mean y and covariance P yy using a (small) set 
of deterministically selected weighted samples, called sigma 
points [6], X 1 denotes the ?'th sigma point from x and uf 
denotes its weight. The sigma points are always chosen such 
that the mean and covariance match those of the original 
distribution, x and P xx . Each sigma point is passed through 
g to obtain new sigma points y, i.e., 

y i = g(x i ) (id 

with mean and covariance calculated as 


The auxiliary variables mfc and Pfc are found by solving the 
following equations over t k - i to t k with initial conditions 

m fc = nifc_i and P fe = P fc _i: 

dm(t)/dt = 2(s — l)P(t)A(f)m(f) 

+ D(i)m(t) + (s — l)P(f)b(f) + E(t) (9) 

dP{t)/dt = 2 (s - l)P(f)A(f)P(i) 

+ D(f)P(f) + P(f)D T (f) + Q(t). (10) 


Both the Kalman and Benes filters may be derived from the 
Daum filter when the assumptions for these filters are made, 
as shown in [4], The main advantage of the approach is 
that, like the Kalman filter for linear Gaussian systems, it 
is an exact filter for the class of nonlinear dynamics that 
it covers. Further, it has computational complexity on the 
order of the Kalman filter. The main disadvantage is that the 
class of nonlinear dynamics that it covers is quite restrictive 
for practical application, as we will see in Section 5. In 
addition, it covers only additive process and sensor noise 
that are of Brownian and Gaussian distributions, respectively. 
This, however, is acceptable for many real systems. Another 
disadvantage is that the Daum filter requires the (offline) 
analytical solution of a PDE (ip(x, /;)) to both check that the 
model in question satisfies the constraints of the filter, and 
for use within the filter equations. It is quite possible that a 
solution to this PDE does not exist or cannot be found with 
the available tools. 


y = J2 wlyl (12) 

i 

P^E^-yX^-yf- (13) 


Several methods exist for selecting sigma points, but we 
describe here only the commonly used symmetric unscented 
transform, in which 2 n x + 1 sigma points are selected sym- 
metrically about the mean in the following way [12]: 


X 1 


K 

(n x + re) ’ 

2 (n x + re) ’ 


i = 0 

i — 1, . . . , 2 n x 



(14) 


(15) 


where \y/{n x + k)P xx J refers to the ith column of the 

matrix square root of ( n x + n)lP xx (e.g., computed using the 
Cholesky decomposition). The number k is a free parameter 
that can be used to tune the higher order moments of the 
distribution. Note that the sigma point weights do not directly 
represent probabilities, so are not restricted to the interval 
[0, 1], If x is assumed Gaussian, then selecting re = 3 — n x is 
recommended [6], A smaller value of re will bring the sigma 
points closer together than a larger value. 


Unscented Kalman Filter 

When exact nonlinear filters are not applicable, approximate 
methods are required. The most familiar is the extended 
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The UKF assumes the general nonlinear form of the state 
and output equations, but is restricted to additive Gaussian 
noise. It follows the same basic form as the Kalman and 



extended Kalman filters, modified to use the sigma points. 
First, n s sigma points Xk-\\k-i are derived from the current 
mean Xfc_i|fc_i and covariance estimates Pfe_ 1 i/ C _ 1 using the 
sigma point selection algorithm of choice. The prediction 
step is: 



k\k-l 

^{^k— l\k— 1 5 ^-k— 1 ) i i 1 5 • • • ) 

(16) 


y k\k — 1 

= h (^L|fc-i)^ = 1 

(17) 


Xfc|fc -1 

= jr W ‘xi lk -i 

i 

(18) 


yfc|fc-i 

n s 

i 

(19) 

Pt 

e|k— 1 — 

Q+ 



/ It s 

^ ^ ^ k\k—l ~ *-k\k-l)(X k\k-l *-k\k— l) > 

i 

(20) 

where Q is the process noise covariance matrix. The update 

step is: 



P 

r yy 

= R + 

n s 

Y2 w \yi\ k -i -yfc|fc-i)(K|fc-i -Yk\k- 

-if 



i 

(21) 

p 

* xy 

It'S 

^ ^ ^ k\k — 1 *-k\k— l)(y k\k— 1 Y k\k— l) 



i 


(22) 

Kfc 

— p pi 

r xy r yy 

(23) 

■X-k\k 

^-k\k- 

-i + K*;(yfc — yfc|fc_i) 

(24) 

*k\k 

~^*k\k- 

-1 ~ KfcPyyK fc , 

(25) 


where R is the sensor noise covariance matrix. 


Particle Filter 

The particle filter is the most general nonlinear filter, as it may 
be directly applied to nonlinear systems with non-Gaussian 
noise terms [7], In particle filters, the state distribution is 
approximated by a set of discrete weighted samples, called 
particles. The particle approximation to the state distribution 
is given by 

{ x k> w k}iL l. ( 26 ) 

where N denotes the number of particles, and for particle 
i, x[. denotes the state vector estimate and w'j, denotes the 
weight. The posterior density is approximated by 

N 

p(*k\yo:k) ~ ^wlS^dxk), (27) 

i = 1 

where <5 * (chtk) denotes the Dirac delta function located at 

X* 

x fc- 

The PF has many variants. Here, we describe the sampling 
importance resampling (SIR) particle filter, using systematic 
resampling [13]. The pseudocode for a single step of the SIR 
filter is shown as Algorithm 1. Each particle is propagated 
forward to time k by sampling new states using the model. 
The particle weight is assigned using y/, ; . The weights are 
then normalized, followed by the resampling step [7] . 


Algorithm 1 SIR Filter 

Inputs: {x^._ 1 ,w]._ 1 }[I 1 ,u fc _ 1;fc ,y /! 

Outputs: 

for i = 1 to N do 

Xfe ~p(xfc|xk_ 1 ,u fe _i) 

<- p(yfc|xLufc) 

end for 

N 

w <— y] wi 

i= 1 

for i = 1 to N do 

wi <- wi/w 

end for 

{ x L«4}£i Resample({x l fe ,w] : } l J I 1 ) 


Variance Control 

In each of these algorithms, only state estimation is described. 
Joint state-parameter estimation can be accomplished by 
augmenting the state vector with the unknown parameters. 
To do this, some type of evolution must be assigned to the 
parameters. The typical solution is to use a random walk, 
i.e., for parameter 0 , 0 k = Ok- 1 + £k- 1 > where (,k-i is 
sampled from some distribution (e.g., zero-mean Gaussian). 
In the PF, this is represented explicitly. In the Daum filter 
and the UKF, this is represented by setting the corresponding 
diagonal element of the process noise matrix (G for the Daum 
filter and Q for the UKF) to a nonzero value. In this way, the 
parameter estimates become time-varying and are modified 
by the filter using the measured outputs. 

The selected variance of the random walk noise determines 
both the rate of parameter estimation convergence and the 
estimation performance once convergence is achieved. A 
large random walk variance gives fast convergence but track- 
ing with too wide a variance, but too small a random walk 
variance will give very slow convergence, if at all, but, 
once achieved, tracking will proceed with a small variance. 
In prognostics, since wear parameter values are generally 
unknown, except possibly for the order of magnitude, it 
becomes very difficult to tune the variance value to get the 
best performance. This has resulted in several approaches that 
attempt to tune this variance value online, e.g., [8, 10, 14], We 
generalize our previous approach to this problem described 
in [10], which was developed only for particle filters, and 
extend it to work with other nonlinear filters. 

In this approach, the algorithm tries to control the variance 
of the hidden wear parameter estimate to a user-specified 
range by modifying the random walk noise variance. Since 
the random walk noise is artificial, we should reduce it as 
much as possible, because this uncertainty propagates into the 
EOL predictions. So, controlling this uncertainty also helps 
to control the uncertainty of the EOL prediction. 

The algorithm for the adaptation of the random walk variance 
vector, vt, is given as Algorithm 2. We assume that the 
distributions that the elements of £ are drawn from can be 
specified using a variance value, and that the variance values 
are tuned initially based on the maximum expected wear 
rates, e.g., if the pump is expected to fail no earlier than 100 
hours, then this corresponds to particular maximum wear rate 
values. Basically, the algorithm computes the actual spread 
of a parameter estimate from the filter estimate, and computes 
the error with the desired level of spread. The variance value 
for that parameter is then modified to reduce the error. We 
use a relative measure of spread, such as relative standard 
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Algorithm 2 vc Adaptation 


Inputs: p(xfc, Ok\yo-.k) 

State: v 5ife _i, 1 1 — 1 
Outputs: vs t 

for all j 6 {1,2,,.., ne} do 

Vj <- RelativeSpread(p(0 fc (j)|yo:fc)) 
if Vj < t.,(s(j)) then 

s(j) <- s (i) + 1 

end if 


vc,fc(i) <- €fc-iC?) 


1 + Pi(s(i)) 


^ - v j( s (i)) \ 
v j(s(j)) 7 


end for 


v 5ifc _i <s- 


deviation or relative median absolute deviation, which can be 
treated equally for any wear parameter value. 

The adaptation proceeds in multiple stages, maintained with 
an s j variable for each parameter (with j referring to the 
parameter index), with the number of stages specified by Sj . 
The Sj values are initialized to 1. Each stage is specified 
using three variables, (i) a lower threshold that, once crossed, 
signals that the next stage should be entered, (ii) the desired 
spread value for the stage, and (iii) a proportional gain term 
used to control the degree of adaptation during that stage. For 
each parameter, the threshold vector tj, the desired spread 
vector v*, and the proportional gain vector P y are of size Sj. 

The algorithm works as follows. For each parameter, indexed 
by j, the current relative spread is computed as v 3 using the 
metric of choice. If this value is below the threshold value 
for the the current stage, tj(s(j)), then the stage number is 
increased. Then the new random walk variance vc k {j) is 
computed. The error between the the actual and the desired 
spread value for the current stage, Vj — v*(s(j))> is normal- 
ized by v* (s(j))- This normalized error is then multiplied by 
the proportional gain term for the current stage, Pj(s(j)), 
and the corresponding variance v^ jfe _-| (j) is increased or 
decreased by that percentage to compute the new variance 
value v 5 )fc (j). 

Because there is some inertia to the process of Vj changing 
in response to a new value of vc fc (j), the gains Pj cannot be 
too large, otherwise Vj will not converge to the desired value, 
instead, it will continually shrink and expand. So, tuning of 
these gains and the other algorithm parameters is necessary. 


4. Prediction 

Prediction is initiated at a given time kp. Using the cur- 
rent joint state-parameter estimate, p(jx.k P ,9k P |yo : fcp), which 
represents the most up-to-date knowledge of the system 
at time kp, the goal is to compute p(EOLk P \yo:k P ) and 
p{RUL kp |y 0; fcp). 


Algorithm 3 EOF Prediction 

Inputs: {{^i p ,9i p ),wi p }f =1 
Outputs: {EOL\ p ,wl p }i =1 

for i — 1 to N do 

k t — k p 
x fc x fe P 

ei «- o\ p 

while Tbol( x L, 0\) = 0 do 

Predict fu- 

oi +1 ~ P (o k+ 1 \oi) 

x fc+ 1 ~ p( x fc+l| x fc, Ufc) 

k <— k + 1 

x fc x fc + i 

ei i- 0’ fc+ i 

end while 

EOLi p +- k 

end for 


deterministically select a small number of samples [15]. 

Given the finite set of N samples, {(xj [ p ,9\. p ),w l kp }^Lj, 
we simply propagate each sample i out to EOF and use the 
original sample weight for the weight of that EOF prediction. 
Each sample is simulated forward to EOF to obtain the com- 
plete EOL distribution. The pseudocode for the prediction 
procedure is given as Algorithm 3 [16], Each sample i is 
propagated forward until Tpo l (xj, ,9' k ) evaluates to 1; at 
this point EOL has been reached for this sample. In general, 
prediction requires hypothesizing future inputs of the system, 
fi/,, but in this paper, we assume future inputs are known. 


5. Case Study 

We adopt a centrifugal pump as a simulation-based case 
study. In this section, we review the pump model (originally 
described in [10]) and present a number of prognostics exper- 
iments. The pump model does not satisfy the requirements for 
the Daum filter, e.g., it has square roots and states appearing 
in the denominator, therefore we apply only the unscented 
Kalman filter and the particle filter. We compare the results 
of these algorithms on the case study. In both cases, we 
apply the variance control algorithm using relative standard 
deviation (RSD) as the measure of spread. 

Pump Modeling 

A schematic of a typical centrifugal pump is shown in Fig. 2. 
Fluid enters the inlet, and the rotation of the impeller, driven 
by a motor, forces fluid through the outlet. Radial and thrust 
bearings help to minimize friction along the shaft, and are 
lubricated by oil in the bearing housing. A seal prevents 
fluid flow into the bearing housing, and wear rings help to 
minimize internal leakage from the outlet to the inlet side of 
the impeller. 


The representation of p(x-k P ,9k P \yo:k P ) is dictated by the 
selected filter for damage estimation. In the case of the 
UKF and PF, the distribution is represented by a finite set of 
weighted samples. If the distribution is given in an analytical 
form or represented by sufficient statistics (e.g., the mean and 
covariance matrix), then generally the distribution cannot be 
analytically propagated to EOL due to the nonlinearity of the 
model, therefore, we must sample from these representations. 
Either random sampling of a sufficient number of samples 
can be performed, or the unscented transform can be used to 


The state of the pump is given by 

x(t) = [w(t) T t (t) T r (t) T 0 (t )] r , (28) 

where uj{t) is the rotational velocity of the pump, T t (t) is 
the thrust bearing temperature, T r (t) is the radial bearing 
temperature, and T a (t) is the oil temperature. 

The rotational velocity of the pump is described using a 
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Outlet 



torque balance. 


u=j{T e (t)-ru>(t)-T L (t))t (29) 

where J is the lumped motor/pump inertia, r e is the elec- 
tromagnetic torque provided by the motor, r is the lumped 
friction parameter, and ry is the load torque. In an induction 
motor, a torque is produced only when there is a difference, 
called slip, between the synchronous speed of the supply 
voltage, lo s and the mechanical rotation, u>, defined as 


L0 S 

Torque r e is expressed from an equivalent circuit represen- 
tation for a three-phase induction motor, based on rotor and 
stator resistances and inductances, and the slip s [17]: 


where q is a flow coefficient. The discharge flow, Q, is then 

Q Qi Q l • 


Pump temperatures are often monitored as indicators of pump 
condition. The oil heats up due to the radial and thrust 
bearings and cools to the environment: 

To = - T 0 ) + H 0 , 2 (T r - T 0 ) - iT 0j3 (T 0 - T 0 )), 

J O 

(36) 

where J Q is the thermal inertia of the oil, and the H ot terms 
are heat transfer coefficients. The thrust bearings heat up due 
to the friction between the pump shaft and the bearings, and 
cool to the oil and the environment: 

f t = J(r t u 2 - H t i(T t - To) - H t , 2 {T t - T a )), (37) 

■h, 

where J t is the thermal inertia of the thrust bearings, r t is the 
friction coefficient for the thrust bearings, and the H t l terms 
are heat transfer coefficients. The radial bearings behave 
similarly: 

f r = j- (ryw 2 - H r i(T r - To) - H r2 [T r - T a )) (38) 

where J r is the thermal inertia of the radial bearings, ry is the 
friction coefficient for the radial bearings, and the H r i terms 
are heat transfer coefficients. Note that r t and r r contribute 
to the overall friction coefficient r. 


= npR 2 wL, 

suy (i?i + R 2 /s ) 2 + (uj s Li + lo s L 2 ) 2 

where Ri is the stator resistance, L\ is the stator inductance, 
R 2 is the rotor resistance, L 2 is the rotor inductance, n is the 
number of phases, and p is the number of magnetic pole pairs. 
Rotor speed is controlled by changing the input frequency ui s . 

The load torque 77 , is a function of the flow rate through the 
pump and the impeller rotational velocity [18, 19]: 

tl = a 0 cj 2 + aiUjQ - a 2 Q 2 , (32) 

where Q is the flow, and « (J , ai, and a 2 are coefficients 
derived from the pump geometry [19]. 

The rotation of the impeller creates a pressure difference from 
the inlet to the outlet of the pump, which drives the pump 
flow, Q. Pump pressure is computed as 

p p = Aui 2 + hujQ - b 2 Q 2 , (33) 

where A is the impeller area, and b\ and b 2 are coefficients 
derived from the pump geometry. Flow through the impeller, 
Qi, is computed using the pressure differences: 

Qi = cij\ps +p p — pd\sign(p s +p p - pf, (34) 

where c is a flow coefficient, p s is the suction pressure, and 
Pd is the discharge pressure. The small (normal) leakage flow 
from the discharge end to the suction end due to the clearance 
between the wear rings and the impeller is described by 

Qi = ci\J\p d -p s \sign(p d ~p s ), 


The overall input vector u is given by 

u (t) = [p s (t) p d (t) T a (t) V{t) w s (f)] T . (39) 

The measurement vector y is given by 

y (t) = [w(t) Q{t ) T t (t) T r (t) T 0 (t)f. (40) 


The performance constraints of the pump are specified by 
efficiency and temperature limits: 


c 1 

77 (f) > 77 

(41) 

c 2 

To(t) < T+ 

(42) 

C 3 

T t (t) < T+ 

(43) 

c 4 

Tr(t) < T+, 

(44) 


where the — subscript denotes a minimum and the + su- 
perscript denotes a maximum, and efficiency rj is defined as 
77 = ( VI)/((pd — p s )Q)- We take = 0.75rjo, where 
is the nominal efficiency. When the maximum temperatures 
are reached, irreversible damage occurs. Here, we use Tf = 
333 K and T+ = T+ = 370 K. 

The most significant damage mechanism for pumps is im- 
peller wear, represented as a decrease in impeller area A [20, 
21]. Since the impeller area is proportional to bo, a decrease 
causes a decrease in the pump pressure, and, hence, the pump 
efficiency. We use the erosive wear equation [22] to describe 
how the impeller area changes over time [ 10 ]: 

A(t) = -w A Qi(t) 2 , 


(35) 
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Table 1. Estimation and Prediction Performance for UKF 


n 

PRMSE 

w A PRMSE,,,, 

PRMSE^ 

RSD TOa 

RSD,„ t 

RSD,!,^ 

RA 

RSD R ul 

i 

2.44 

1.95 

3.29 

10.97 

10.05 

9.40 

97.42 

8.39 

10 

3.36 

2.77 

3.74 

10.56 

9.77 

9.54 

96.67 

8.46 

100 

4.26 

2.56 

4.48 

11.27 

9.68 

9.77 

95.65 

9.15 

1000 

3.82 

3.41 

4.17 

11.91 

10.26 

11.41 

94.84 

10.47 

Table 2. Estimation and Prediction Performance for PF 

n 

PRMSE 

WA PRMSE,,,, 

PRMSE^ 

RSDu, a 

RSD„, t 

RSD u , r 

RA 

RSD^t/i 

i 

4.32 

5.73 

3.65 

12.73 

11.77 

11.27 

96.95 

10.65 

10 

4.99 

3.05 

3.43 

12.63 

11.71 

11.02 

96.38 

10.68 

100 

11.77 

4.33 

7.27 

12.52 

12.22 

10.90 

92.47 

10.71 

1000 

20.54 

10.25 

19.35 

12.43 

11.30 

11.43 

71.52 

13.29 


where wa is a wear coefficient. Because A is proportional to 
bo, then b 0 (t) = kA{t) = —kwAQi(t) 2 , so we estimate b 0 {t) 
and Wb 0 = kvuA- 

Another significant damage mechanism for pumps is bearing 
wear, captured as an increase in the friction coefficient. 
Sliding and rolling friction generate wear of material which 
increases the coefficient of friction [10,22]: 

r t {t) = w t r t oj 2 (45) 

r r (t) = w r r r ur, (46) 

where wt and w r are the wear coefficients. 

Results 

We performed a number of simulation experiments in which 
the values of wear parameters for the different damage pro- 
cesses were selected randomly in [1 x 10~ 3 ,4 x 10~ 3 ] at 
increments of 1 x 10 -3 for Wb 0 , in [1 x 10 — 11 , 7 x 10 -11 ] 
at increments of 0.5 x 10~ n for w t and w r , such that the 
maximum wear rates corresponded to a minimum EOL of 20 
hours. The filters had to jointly estimate the state and these 
unknown parameters. Both algorithms used the variance 
control algorithm, controlling the RSD of the hidden wear 
parameter estimates, with Sj = 2 and v* = [50, 10], T ;/ = 
[60,0], and Pj = [1 x 10~ 3 ,1 x 10 -4 ] for all j. In order 
to investigate how the algorithms performed under different 
noise levels, we varied the sensor noise variance by factors 
of 1, 10, 100, and 1000, and performed 15 experiments for 
each case, averaging results over each case. The future input 
of the pump was considered to be known, and it was always 
operated at a constant RPM. Hence, the only uncertainty 
present is that involved in the noise terms and that introduced 
by the filtering algorithms. 

The estimation performance is calculated using percent root 
mean square error (PRMSE) for accuracy of the wear param- 
eter estimates, and RSD for spread. Note that the variance 
control algorithm attempts to control the RSD of each wear 
parameter estimate to 10%, so ideally the computed RSD 
would be exactly that. The RUL prediction performance is 
calculated using relative accuracy (RA) [11] for accuracy (a 
score of 100% is best) and RSD for prediction spread. The 
prediction metrics are computed at each prediction point and 


averaged over an entire experiment. 

The results for the UKF are summarized in Table 1 . The wear 
parameters were estimated with very good accuracy, and with 
spread around the desired 10%. As sensor noise increased 
(denoted by the column labeled n in the table), estimation 
performance generally became worse in both accuracy and 
precision, but the variance control algorithm was still able to 
keep RSD near the desired level, even when the sensor noise 
variance was increased by a factor of 1000. The good estima- 
tion performance translated to good prediction performance 
since there was no uncertainy in the future inputs. Both 
RA and RSD of the predictions were good and performance 
decreased with increased noise, as expected. 

The results for the PF are summarized in Table 2. The 
PF used N = 500 particles and assumed the sensor noise 
variance was 10 times the actual value (this is sometimes 
called a “roughening penalty” and in this case improves 
convergence and accuracy). Similar trends are observed 
here, however, the accuracy is decreased and the variance 
control algorithm has a more difficult time controlling RSD. 
As sensor noise increases, the PF has a more difficult time 
converging, and when the sensor noise variance is increased 
by a factor of 1000, some experiments did not converge, 
resulting in very poor estimation and prediction and bringing 
the overall average down. The UKF, on the other hand, easily 
converged with all noise levels considered. Performance of 
the PF could possibly be improved by increasing the number 
of particles and/or changing the parameters of the variance 
control algorithm. 

The performance of the two filters is easily compared visu- 
ally. Here, we choose the case where Wb 0 = 2 x 10 -3 , 
wt = 3 x 10“ n , and w r = 1 x 10~ n . Estimation 
results for the UKF are shown in Fig. 3 and results for the 
PF are shown in Fig. 4. We show the true values, denoted 
with the * superscript, the mean estimate, and the minimum 
and maximum values of the samples. The spread of the 
UKF estimate appears significantly smaller, but this is only 
because the sigma points represent the distribution with a set 
of sigma points whose minimum and maximum values are 
very close to the mean, whereas with the PF, the samples 
are stochastically selected, so the minimum and maximum 
values may be far from the mean. In both cases the effects 
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Figure 3. UKF estimation of pump wear parameters. 
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of the variance control algorithm are also observed, e.g., for 
w r the initial estimate has a very large variance, but quickly 
decreases to the desired level of relative spread. Convergence, 
though, is clearly better with the UKF. 
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Figure 5. a-X performance with a = 0.1 and f3 = 0.5 for 
the UKF. 
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Figure 6 . a-X performance with a = 0.1 and f3 = 0.5 for 
the PF. 


The prediction results for these cases are shown in Fig. 5 for 
the UKF and Fig. 6 for the PF using an a-X plot [11], With 
the a-X metric, we check at each prediction point whether 
f3 of the distribution lies within a of the true RUL. In the 
figure, the accuracy bound defined by a = 10% is shown 
as a gray cone, and we select /3 = 50%. For the UKF, the 
metric is satisfied at all prediction points, with about 70% 
of the distribution falling within the accuracy bound, with 
the mean always falling within the bound. For the PF, the 
mean always falls within the bound, but the portion of the 
distribution falling within the bound is consistently less than 
with the UKF, and the metric fails at the third prediction point. 

Overall, the UKF consistently performs better than the PF 
for this case study. Further, it does this at a much reduced 
computational cost. The complete state-parameter vector 
has 11 variables, so the UKF with the symmetric transform 
requires 2x11 + 1 = 23 sigma points, whereas the PF used 
500 particles. This also translates to computational savings 
in the prediction algorithm, because ( i ) only 23 simulations 
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have to be performed compared to 500, and (ii) the minimum 
values are larger for the UKF than for the PF, and the time 
to EOL (and thus the time to simulate to EOL) is inversely 
proportional to the wear rate value. So, prediction using the 
sigma points is more efficient than using the particles. Note, 
however, that the unscented transform may be applied to the 
particle distribution and the resulting sigma points can be 
simulated forward instead as shown in [15]. 

Both filters require some degree of tuning. The UKF is 
generally easier to tune because it has fewer free parameters. 
The most difficult aspect can be selecting the parameters of 
the unscented transform. In this case, we used k = 3 — n x = 
3 — 11 = —8, as recommended in [6], which worked quite 
well. With the PF, the number of particles and the roughening 
penalties must be tuned, but some general heuristics are 
available [7]. 

It is expected that the Daum filter, when applicable, can 
improve performance over approximate filters like the UKF 
and PF. However, its applicability for prognostics is appar- 
ently limited, because many damage progression processes 
are nonlinear and may not meet the rather strict requirements 
on the model dynamics. This problem becomes more likely 
when joint state-parameter estimation must be performed, 
as introducing unknown parameters as states increases the 
amount and types of nonlinearities. For example, in the dam- 
age progression equation 45 for ft, we have effectively three 
states when performing joint state-parameter estimation: r t , 
Wt, and u!. When taking jff we end up with Wt.oJ 2 which 
would violate equation 4. An approximate version of the 
Daum filter is available, but still requires solving analytically 
the PDE of equation 3. For the pump, a solution to this 
PDE could not be found by Mathematica, as it is not able 
to solve PDEs analytically for multi-dimensional systems. It 
is questionable as to whether it is worth the effort to apply 
the Daum filter when simpler approximate filters, like the 
UKF, may be used, and whether the possible improvement 
in accuracy over the UKF is significant enough to justify the 
additional filter complexity. 


6. Conclusions 

In this paper, we reviewed nonlinear filtering approaches with 
application to prognostics, including the Daum filter, the un- 
scented Kalman filter, and the particle filter. In model-based 
prognostics, joint state-parameter estimation determines the 
current health state of the system, and this is followed by 
a prediction algorithm that uses the estimated state as an 
input. We adopted a centrifugal pump as a case study, and 
in simulation demonstrated the application of the UKF and 
PF to the pump. The Daum filter had requirements on the 
model dynamics that were too strong in order to apply it 
also to the pump, and seems to have limited applicability to 
prognostics applications due to the constraints it puts on the 
model. In this case study, we found the UKF to outperform 
the PF in damage estimation, and, as a result, the predictions 
obtained were also superior in both accuracy and precision. 
The UKF was also easier to tune and had significantly lower 
computational complexity. 

In the future, we would like to further investigate the applica- 
bility of the Daum filter and its related exact nonlinear filters 
to prognostics applications. When applicable, it is important 
to compare its performance to simpler algorithms like the 
extended Kalman filter and the UKF relative to the effort in 
applying the filter. Further, it remains to be seen whether the 


approximate version of the Daum filter can outperform the 
UKF, and, if so, if the performance gain is significant enough 
to justify the increase in development effort of the filter. 
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